jiahanc (Collaborator) commented Oct 20, 2025

📌 Description

  • Update the trtllm-gen fused MoE headers.
  • Add new kernels for trtllm-gen fused MoE:
    • NvFp4: add tile 256
    • MxFp8 x MxFp4: add tiles 128 and 256
    • FP8 per-tensor: add tiles 192 and 256
    • FP8 block scale: add tile 128
  • Update the logic of computeSelectedTileN.
  • Add tune_max_num_tokens to FP8 per-tensor and FP8 block scale.
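
For readers unfamiliar with tile selection, the sketch below illustrates one plausible way a tile size could be chosen from a set of available kernels. It is not the computeSelectedTileN implementation, which this description does not show. The function name `select_tile_n`, its parameters, and the example tile list are all hypothetical, and the heuristic (pick the smallest available tile that covers the average number of tokens routed to each local expert) is an assumption made for illustration only.

```python
from bisect import bisect_left


def select_tile_n(num_tokens, top_k, num_local_experts, supported_tiles):
    """Hypothetical heuristic: smallest supported tile >= average tokens per expert.

    This is an illustrative assumption, not the logic used by computeSelectedTileN.
    """
    if num_local_experts <= 0:
        raise ValueError("num_local_experts must be positive")
    if not supported_tiles:
        raise ValueError("supported_tiles must not be empty")

    tiles = sorted(supported_tiles)
    # Each token is routed to top_k experts, so this is the mean load per expert.
    avg_tokens_per_expert = num_tokens * top_k / num_local_experts
    # Index of the first tile that is >= the average load.
    idx = bisect_left(tiles, avg_tokens_per_expert)
    # If the load exceeds every tile, fall back to the largest one.
    idx = min(idx, len(tiles) - 1)
    return tiles[idx]


# Hypothetical tile list; the real set depends on the quantization format.
example_tiles = [8, 16, 32, 64, 128, 192, 256]
print(select_tile_n(1200, 8, 64, example_tiles))
```

Under this assumed heuristic, a non-power-of-two tile such as 192 is picked whenever the average load falls between 128 and 192, which is one reason a selection routine may need revisiting when such tiles are added.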

🔍 Related Issues

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

  • I have installed pre-commit by running pip install pre-commit (or used your preferred method).
  • I have installed the hooks with pre-commit install.
  • I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

  • Tests have been added or updated as needed.
  • All tests are passing (unittest, etc.).

Reviewer Notes

coderabbitai bot (Contributor) commented Oct 20, 2025

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


jiahanc force-pushed the updateTileNCalc branch 2 times, most recently from 9d9ad95 to 7cd156d, on October 22, 2025 22:21
Signed-off-by: Siyuan Fu <[email protected]>
IwakuraRein changed the title from "Revise the calculation related to TileN in routing of MOE TRTLLM backend" to "Update trtllm-gen fused moe routing kernel and add more kernels" on Nov 1, 2025